Figure 1

BMY_HPP1_A



1 CTAGTTTACT TCTACAATTT CGGATGGAAG GATTATGGTG TAGCGTCTCT TACTACTATC 60 
1 LVYF YNF GWK DYGV ASL TTI 20



61 CTAGATATGG TGAAGGTGAT GACATTTGCC TTACAGGAAG GAAAAGTAGC TATCCATTGT 120 
21 LDMV KVM TFA LQEG KVA IHC 40



121 CATGCAGGGC TTGGTCGAAC AGGT 144 
41 HAGL GRT G 48



BMY_HPP1_B



1 GATGTCTTCT GGGCCCTCCT GTGGAACACA GTT 33 
1 DVFW ALL WNT V 11
Figure 2



1 GTGGCCCGGGAGGCGCCGAGGCCAGGTAGGTGCGATGGGCGTGCAGCCCCCCAACTTCTC 60 

1 WPGRRRGQVGAMGVQPPNFS 20 

61 CTGGGTGCTTCCGGGCCGGCTGGCGGGACTGGCGCTGCCGCGGCTCCCCGCCCACTACCA 120 

21 WVLPGRLAGLALPRLPAHYQ 40 

121 GTTCCTGTTGGACCTGGGCGTGCGGCACCTGGTGTCCCTGACGGAGCGCGGGCCCCCTCA 180 

41 FLLDLGVRHLVSLTERGPPH 60 



181 CAGCGACAGCTGCCCCGGCCTCACCCTGCACCGCCTGCGCATCCCCGACTTCTGCCCGCC 240 
61 SDSCPGLTLHRLRIPDFCPP 80 



241 GGCCCCCGACCAGATCGACCGCTTCGTGCAGATCGTGGACGAGGCCAACGCACGGGGAGA 300 
81 APDQIDRFVQIVDEANARGE 100 



301 GGCTGTGGGAGTGCACTGTGCTCTGGGCTTTGGCCGCACTGGCACCATGCTGGCCTGTTA 360 
101 AVGVHCALGFGRTGTMLACY 120 



361 CCTGGTGAAGGAGCGGGGCTTGGCTGCAGGAGATGCCATTGCTGAAATCCGACGACTACG 420
121 LVKERGLAAGDAIAEIRRLR 140 



421 ACCCGGCCCCATCGAGACCTATGAGCAGGAGAAAGCAGTCTTCCAGTTCTACCAGCGAAC 480 

141 PGPIETYEQEKAVFQFYQRT 160 

481 GAAATAAGGGGCCTTAGTACCCTTCTACCAGGCCCTCACTCCCCTTCCCCATGTTGTCGA 540

161 K*GALVPFYQALTPLPHVVD 180 



541 TGGGGCCAGAGATGAAGGGAAGTGGACTAAAGTATTAAACCCTCTAGCTCCCATTGGCTG 600 
181 GARDEGKWTKVLNPLAPIG* 200 



601 AAGACACTGAAGTAGCCCACCCCTGCAGGCAGGTCCTGATTGAAGGGGAGGCTTGTACTG 660 
201 RH*SSPPLQAGPD*RGGLYC 220 



661 CTTTGTTGAATAAATGAGTTTTACGAACCAGGGAAAAAAAAAAAAAAAAAAAAGAAAAAA 720 
221 FVE*MSFTNQGKKKKKKRKK 240 



721 AAAAAAAAAAAAAAAAAAAAAAAGAA 746
241 KKKKKKKR 248 
Figure 3



1 ATGGCTAGAA TGAACCTCCC TGCTTCTGTG GACATTGCAT ACAAAAATGT GAGATTTCTT 60 
1 MARM NLP ASV DIAY KNV RFL 20



61 ATTACACACA ACCCAACCAA TACCTACTTT AATAGATTCT TACAGGAACT TAAGCAGGAT 120 
21 ITHN PTN TYF NRFL QEL KQD 40

121 GGAGTTACCA CCATAGTAAG AGTATGAAAA GCAACTTACA ACATTGCTCT TTTAGAGAAG 180 
41 GVTT IVR V*K ATYN IAL LEK 60

181 GGAAGCATCC AGGTTCCGGA CTGGCCTTTT GATGATGGTA CAGCACCATC CAGCCAGATA 240 
61 GSIQ VPD WPF DDGT APS SQI 80

241 ATTGATAACT GGTTAAAACT TATGAAAAAT AAATTTCATG AAGATCCTGG TTGTTGTATT 300 
81 IDNW LKL MKN KFHE DPG CCI 100

301 GCAATTCACT GTGTTGTAGG TTTTGGGTGA GCTCCAGTTG CTAGTTGCCC TAGCTTTAAT 360 
101 AIHC VVG FG* APVA SCP SFN 120

361 TGAAGGTGGA ATGAAATATG AAAATGTAGT ACAGTTCATC AGATAAAAGT GACATGGAAC 420 
121 *RWN EI* KCS TVHQ IKV TWN 140



421 TTTTAACAGC AAACAACTTT TGTATTTGGA GAAATATTGT CTTAAAATAT GCTTGCACCT 480 
141 F*QQ TTF VFG EILS *NM LAP 160



481 CAGAAATCCC AGAAATAACT GTTTCCTTCA G 511 
161 Q K S Q K * L F P S 171 
Figure 4A

1 CTCAGGCAGA ACTATGAGGC CAAGAGTGCT CATGCGCACC AGGCTTTCTT TTTGAAATTC 60 
1 LRQN YEA KSA HAHQ AFF LKF 20



61 GAGGAGCTGA AGGAGGTGAG CAAGGAGCAG CCCAGACTGG AGGCTGAGTA CCCTGCCAAC 120 
21 EELK EVS KEQ PRLE AEY PAN 40



121 ACCACCAAGA ACTGTTAACC ACATGTGCTA CCCTATGACC ACTCCAGGGT CAGGCTGACC 180 
41 TTKN C*P HVL PYDH SRV RLT 60



181 CAGCTGGAGG GAGAGCCTCA TTCTGACTAC ATCAATGCCA ACTTGGTCCC AGGCTACACC 240 
61 QLEG EPH SDY INAN LVP GYT 80



241 CGCCCACAGG AGTTCATTGC CTCTCAGGGG CCTCTCAAGA AAACACTGGA GAACTTCTGG 300 
81 RPQE FIA SQG PLKK TLE NFW 100



301 CGGCTGGTGC GGGAGCAGCA GGTCCGCATC ATCATCATGC CGACCATCAG CATGGAGAAC 360 
101 RLVR EQQ VRI IIMP TIS MEN 120 



361 GGGAGGGTGC TGTGTGAGCA TTACTGGCTG ACCGACTCTA CCCCGGACAC CCATGGTCAC 420 
121 GRVL CEH YWL TDST PDT HGH 140 



421 ATCACCATCC ACCTCCTAGC TGAGGAGCCT GAGGATGAGT GGACCAAGCG GGAATTCCAG 480 
141 ITIH LLA EEP EDEW TKR EFQ 160 



481 CTGCAGCACG TTGTCCAGCA ACATCAACGG AGGGTGGAGC AACTGCAGTT CACCACCTGA 540 
161 LQHV VQQ HQR RVEQ LQF TT* 180 



541 TCCGACCACA GCATCCTTGA GGCTCCCAGC TCCCTGCTCG CCTTTATGGA GCTGGTACAG 600 
181 SDHS ILE APS SLLA FME LVQ 200



601 TAGCAGGCAA GGGCCACCCA GGGCGTGGGA CCCATCCTGG TGCACTGCAG GGGCTGTCCC 660 

201 *QAR ATQ GVG PILV HCR GCP 220 

661 TGCGGTGTGG GCATGGGCCG GACAGGCACC TTCGTGGCCC TGTCGAGGCT GCTGCAGCAG 720

221 CGVG MGR TGT FVAL SRL LQQ 240 

721 CTGGAGGAGG AGCAGATGGT AGACGTGTTC CATGCTGTGT ATGCACTCCG GATGCACCAG 780 

241 LEEE QMV DVF HAVY ALR MHQ 260 



781 CCCCTCATGA TCCAGACCCT GAGCCAGTAC GTCTTCCTGC ACAGCTGCCT ACTGAACAAG 840
261 PLMI QTL SQY VFLH SCL LNK 280



841 ATTCTGGAAG GACCCTTCAA CATCTCTGAG TCTTGGCCCA TCTCTGTGAC GGACCTCCCG 900 
281 ILEG PFN ISE SWPI SVT DLP 300 



901 CAGGCGTGTG CCAAGAGGGC AGCCAGTGCC AATGCTGGCT TCTTGAAGGA GTACGAGGCC 960 
301 QACA KRA ASA NAGF LKE YEA 320



961 ATCAAGGACG AGGCTGGCTT TTCCGCACCC CCGCCTGGCT ATGAGCAGGA CAGCCCCGTC 1020
321 IKDE AGF SAP PPGY EQD SPV 340 



1021 TCCTATGACC GTTCTCAGGG GCAGTTTTCT CCGGTGGAGG AGAGCCCCCC TGACGACATG 1080
341 SYDR SQG QFS PVEE SPP DDM 360



1081 CCTCTCTGGA AGCCAATGAT CTGTGCTCTG CAGGGTGGGC CCTCTGGCCG TGATCATACG 1140 
361 PLWK PMI CAL QGGP SGR DHT 380
Figure 4B



1141 GTGCTGACTG GCCCCGCAGG GCCAAAGGAG CTCTGGGAGC TGGTGTGGCA GCACAGGGCT 1200
381 VLTG PAG PKE LWEL VWQ HRA 400 



1201 CATGTGCTTG TCTCTCTTTG CCCACCCAAT GTCATGGAGA AGGAATTCTG GCCAACGGAG 1260 
401 HVLV SLC PPN VMEK EFW PTE 420 



1261 ATGCAGCCCG TAGTCACAGA CATGGTGACG GTGCACTGGG TGGCTGAGAG CAGCACAGCA 1320 
421 MQPV VTD MVT VHWV AES STA 440 



1321 GGCTGGTTCT GTACCCTCCT CAGGGTCACA CATGGGGAGA GCAGGAAGGA AAGGGAGGTG 1380 
441 GWFC TLL RVT HGES RKE REV 460 



1381 CAGAGACTGC AATTTCCATA CCTGGAGCCT GGGCATGAGC TGCCCGCCAC CACCCTGCTG 1440 
461 QRLQ FPY LEP GHEL PAT TLL 480



1441 CCCTTCCTGG CTGCTGTGGG CCAGTGCTGC TCTCGGGGCA ACAACAAGAA GCCGGGCACA 1500 

481 PFLA AVG QCC SRGN NKK PGT 500 

1501 CTGCTCAGCC ACTCCAACAA GGGTGCAACC CAGCTGGGCA CCTTCCTGGC CATGGAGCAG 1560

501 LLSH SNK GAT QLGT FLA MEQ 520

1561 CTGCTGCAGC AGGCAGGGTC TGAGTGCACC GTGGATATCT TTAACGTGGC CCTGCAGCAG 1620 

521 LLQQ AGS ECT VDIF NVA LQQ 540 



1621 TCTCAGGCCT GTGGCCTTAT GACCCCAACA CTGAAGCAGT ATGTCTACCT CTACAACTGT 1680 
541 SQAC GLM TPT LKQY VYL YNC 560



1681 CTGAACAGCG CGCTGGCAGA CGGGCTGCCC 1710
561 LNSA LAD GLP 570 
Figure 5A

1 ATGTTCATTTTAAAAAACTTCAGGATGGGCACAAACACACAGAAGTGGGAAATGAATAAA 60 

61 AGAGTATTGATAAATTTTTGAAAATTGTTGAAGCTGAGTAATGGGCTTTCAGTCCAGTGT 120 

121 AAAGCTGTTGGAGCGCGGGAGCAAAGGTAAAGAATGATGTAATGCGCTGGCTGCTCCAAA 180 

181 GCATCTTTTGTTGTGGAATGGTTATTCCAGTCATCTCTTTATGAATCAAATGTGAGGGGC 240

241 TGCTTTGTGGACGGAGTCCTTTGCAAGAGCACATCAACGGGAAAGAGAAAGAGACATTCA 300 



301 CTTGGAGGGCTCTTGCTGAAAATGGGTTTAACTCTCCTTTTGCCAGTCACCACCAGCCTG 360



361 ACCTCATACACTTTTAGTACAATGGAGTGGCTGAGCCTTTGAGCACACCACCATTACATC 420 



421 ATCGTGGCAAATTAAAGAAGGAGGTGGGAAAAGAGGACTTATTGTTGTCATGGCCCATGA 480 
1 M A H E 4 



481 GATGATTGGAACTCAAATTGTTACTGAGAGGTTGGTGGCTCTGCTGGAAAGTGGAACGGA 540
5 MIGTQIVTERLVALLESGTE 24 



541 AAAAGTGCTGCTAATTGATAGCCGGCCATTTGTGGAATACAATACATCCCACATTTTGGA 600 
25 KVLLIDSRPFVEYNTSHILE 44 



601 AGCCATTAATATCAACTGCTCCAAGCTTATGAAGCGAAGGTTGCAACAGGACAAAGTGTT 660 
45 AININCSKLMKRRLQQDKVL 64 



661 AATTACAGAGCTCATCCAGCATTCAGCGAAACATAAGGTTGACATTGATTGCAGTCAGAA 720 
65 ITELIQHSAKHKVDIDCSQK 84 



721 GGTTGTAGTTTACGATCAAAGCTCCCAAGATGTTGCCTCTCTCTCTTCAGACTGTTTTCT 780
85 VVVYDQSSQDVASLSSDCFL 104 



781 CACTGTACTTCTGGGTAAACTGGAGAAGAGCTTCAACTCTGTTCACCTGCTTGCAGGTGG 840
105 TVLLGKLEKSFNSVHLLAGG 124



841 GTTTGCTGAGTTCTCTCGTTGTTTCCCTGGCCTCTGTGAAGGAAAATCCACTCTAGTCCC 900 
125 FAEFSRCFPGLCEGKSTLVP 144 



901 TACCTGCATTTCTCAGCCTTGCTTACCTGTTGCCAACATTGGGCCAACCCGAATTCTTCC 960
145 TCISQPCLPVANIGPTRILP 164 
Figure 5B

961 CAATCTTTATCTTGGCTGCCAGCGAGATGTCCTCAACAAGGAGCTGATGCAGCAGAATGG 1020

165 NLYLGCQRDVLNKELMQQNG 184 

1021 GATTGGTTATGTGTTAAATGCCAGCAATACCTGTCCAAAGCCTGACTTTATCCCCGAGTC 1080 

185 IGYVLNASNTCPKPDFIPES 204 



1081 TCATTTCCTGCGTGTGCCTGTGAATGACAGCTTTTGTGAGAAAATTTTGCCGTGGTTGGA 1140 
205 HFLRVPVNDSFCEKILPWLD 224 



1141 CAAATCAGTAGATTTCATTGAGAAAGCAAAAGCCTCCAATGGATGTGTTCTAGTGCACTG 1200 
225 KSVDFIEKAKASNGCVLVHC 244 



1201 TTTAGCTGGGATCTCCCGCTCCGCCACCATCGCTATCGCCTACATCATGAAGAGGATGGA 1260
245 LAGISRSATIAIAYIMKRMD 264 



1261 CATGTCTTTAGATGAAGCTTACAGATTTGTGAAAGAAAAAAGACCTACTATATCTCCAAA 1320 
265 MSLDEAYRFVKEKRPTISPN 284 



1321 CTTCAATTTTCTGGGCCAACTCCTGGCCTATGAGAAGAAGATTAAGAACCAGACTGGAGC 1380 

285 FNFLGQLLAYEKKIKNQTGA 304 

1381 ATCAGGGCCAAAGAGCAAACTCAAGCTGCTGCCCCTGGAGAAGCCAAATGAACCTGTCCC 1440

305 SGPKSKLKLLPLEKPNEPVP 324 



1441 TGCTGTCTCAGAGGGTGGACAGAAAAGCGAGACGCCCCTCAGTCCACCCTGTGCCGACTC 1500
325 AVSEGGQKSETPLSPPCADS 344 



1501 TGCTACCTCAGAGGCAGCAGGACAAAGGCCCGTGCATCCCGCCAGCGTGCCCAGCGTGCC 1560 
345 ATSEAAGQRPVHPASVPSVP 364 



1561 CAGCGTGCAGCCGTCGCTGTTAGAGGACAGCCCGCTGGTACAGGCGCTCAGTGGGCTGCA 1620 
365 SVQPSLLEDSPLVQALSGLH 384 



1621 CCTGTCCGCAGACAGGCTGGAAGACAGCAATAAGCTCAAGCGTTCCTTCTCTCTGGATAT 1680

385 LSADRLEDSNKLKRSFSLDI 404 

1681 CAAATCAGTTTCATATTCAGCCAGCATGGCAGCATCCTTACATGGCTTCTCCTCATCAGA 1740 

405 KSVSYSASMAASLHGFSSSE 424 



1741 AGATGCTTTGGAATACTACAAACCTTCCACTACTCTGGATGGGACCAACAAGCTATGCCA 1800 
425 DALEYYKPSTTLDGTNKLCQ 444 



1801 GTTCTCCCCTGTTCAGGAACTATCGGAGCAGACTCCCGAAACCAGTCCTGATAAGGAGGA 1860 
445 FSPVQELSEQTPETSPDKEE 464 
Figure 5C



1861 AGCCAGCATCCCCAAGAAGCTGCAGACCGCCAGGCCTTCAGACAGCCAGAGCAAGCGATT 1920 
465 ASIPKKLQTARPSDSQSKRL 484



1921 GCATTCGGTCAGAACCAGCAGCAGTGGCACCGCCCAGAGGTCCCTTTTATCTCCACTGCA 1980
485 HSVRTSSSGTAQRSLLSPLH 504 



1981 TCGAAGTGGGAGCGTGGAGGACAATTACCACACCAGCTTCCTTTTCGGCCTTTCCACCAG 2040
505 RSGSVEDNYHTSFLFGLSTS 524 



2041 CCAGCAGCACCTCACGAAGTCTGCTGGCCTGGGCCTTAAGGGCTGGCACTCGGATATCTT 2100 
525 QQHLTKSAGLGLKGWHSDIL 544 



2101 GGCCCCCCAGACCTCTACCCCTTCCCTGACCAGCAGCTGGTATTTTGCCACAGAGTCCTC 2160 
545 APQTSTPSLTSSWYFATESS 564 



2161 ACACTTCTACTCTGCCTCAGCCATCTACGGAGGCAGTGCCAGTTACTCTGCCTACAGCTG 2220 
565 HFYSASAIYGGSASYSAYSC 584 



2221 CAGCCAGCTGCCCACTTGCGGAGACCAAGTCTATTCTGTGCGCAGGCGGCAGAAGCCAAG 2280 
585 SQLPTCGDQVYSVRRRQKPS 604 



2281 TGACAGAGCTGACTCGCGGCGGAGCTGGCATGAAGAGAGCCCCTTTGAAAAGCAGTTTAA 2340
605 DRADSRRSWHEESPFEKQFK 624 



2341 ACGCAGAAGCTGCCAAATGGAATTTGGAGAGAGCATCATGTCAGAGAACAGGTCACGGGA 2400
625 RRSCQMEFGESIMSENRSRE 644 



2401 AGAGCTGGGGAAAGTGGGCAGTCAGTCTAGCTTTTCGGGCAGCATGGAAATCATTGAGGT 2460
645 ELGKVGSQSSFSGSMEIIEV 664



2461 CTCCTGAGAAGAAAGACACTTGTGACTTCTATAGACAATTTTTTTTTCTTGTTCACAAAA 2520
665 S 665 



2521 AAATTCCCTGGGAATCTGAAATATGTATGTGGGCATACATATATATTTTTGGAAAATGGA 2580 



2581 GCTATGGTGTAAAAGCAACAGGTGGATCAACCCAGTTGTTACTCTCTTAACATCTGCATT 2640



2641 TCAGAGATCAGCTAATACTTGCTCTCAACAAAAATGGAAGGGCAGATGCTAGAATCCCCC 2700 



2701 CTAGACGGAGGAAAACCATTTTATTCAGTGAATTACACATCCTCTTGTTCTTAAAAAAGC 2760 



2761 AAGTGTCTTTGGTGTTGGAGGACAAAATCCCCTACCATTTTCACGTTGTGCTACTAAGAG 2820



Figure 5D 

2821 ATCTCAAATATTAGTCTTTGTCCGGACCCTTCCATAGTACACCTTAGCGCTGAGACTGAG 2880 

2881 CCAGCTTGGGGGTCAGGTAGGTAGACCCTGTTAGGGACAGAGCCTAGTGGTAAATCCAAG 2940

2941 AGAAATGATCCTATCCAAAGCTGATTCACAAACCCACGCTCACCTGACAGCCGAGGGACA 3000

3001 CGAGCATCACTCTGCTGGACGGACCATTAGGGGCCTTGCCAAGGTCTACCTTAGAGCAAA 3060 

3061 CCCAGTACCTCAGACAGGAAAGTCGGGGCTTTGACCACTACCATATCTGGTAGCCCATTT 3120

3121 TCTAGGCATTGTGAATAGGTAGGTAGCTAGTCACACTTTTCAGACCAATTCAAACTGTCT 3180 

3181 ATGCACAAAATTCCCGTGGGCCTAGATGGAGATAATTTTTTTTTCTTCTCAGCTTTATGA 3240

3241 AGAGAAGGGAAACTGTCTAGGATTCAGCTGAACCACCAGGAACCTGGCAACATCACGATT 3300 

3301 TAAGCTAAGGTTGGGAGGCTAACGAGTCTACCTCCCTCTTTGTAAATCAAAGAATTGTTT 3360 

3361 AAAATGGGATTGTCAATCCTTTAAATAAAGATGAACTTGGTTTCAAGCCAAATGTGAATT 3420 

3421 TATTTGGGTTGGTAGCAGAGCAGCAGCACCTTCAAATTCTCAGCCAAAGCAGATGTTTTT 3480

3481 GCCCTTTCTGCTTCACTGCATGGATACAGTTGGTAAAATGTAATAATATGGCAGAATTTT 3540

3541 ATAGGAAACTTCCTAGGGAGGTAAATTATGGGAAGATTAAGAAAGGTACAAATTGCTGAG 3600

3601 GAGAAGCAGGAAACCTGTTTCCTTAGTGGCTTTTATCCCCTCGGCATGCGATGGGGCTGA 3660 

3661 TGTTTCTATAATTGCCTCAGACTTTCACATTTACTAGTAGGGCTGAGAGAGGCTTTAGTG 3720

3721 AGGAAAGAATATTCAGAATAAAACGGTTGAGAAAGCTGAGAAGACCATTGAGTTTTGATC 3780

3781 AGTTGTGAATAGAGTGCAAAGCCATGGCCAAGCTGTTTTTGGAAACGCTGGCCGGCGTGT 3840

3841 CTTCAGTGGAAAAAGCAAATCAAAATGGAGCGAGAGCAAAGGGGCGTCCTCAGTCCTCAA 3900 

3901 CCTACAATCACTGTATGGAATCGGTCCTGGCAGCTGAACATAGGAGGTCACTGGAACAAG 3960 

3961 TGATAGTGCAGATTGGCTTTCAAACATCCTCCTGGCTTGAGTTTTATCAGCTACAATGTG 4020
Figure 5E

4021 GGTCCTCTTTTGAAGCCTTAATTCACAACAGCAGCTTTTTGGGGGTGGGGCTGGGCGGGT 4080 

4081 GTTGTCATTGTTCTTTCCCTTCCTGTAAGTGTCGCTAGTTGCTGCCTCGTATCTCAGGTT 4140

4141 TTTCTCTGTTTTTGAGAAATGGACAGTTTTTTGACCAGGATGTGACTTCATGTTTCCTAT 4200 

4201 GGTGACTTCTAAAACCAGCACAGAATGATATGACTCAACACAGACCGACTTGGTTATGGG 4260 

4261 GATGATGAGCCGCACAGACCTCACTAGTTGTGCACAAATAATGTGCTATGATGGGGTGTA 4320 

4321 AAGTGAAGGCAGAAGAGGGTCAGCCGCATTGTTATGATACTGGGAAAGTGCCGGTCAACG 4380 

4381 ATTTGAGTTAGTTTTTAGATATACATTGAAATCTTTAATCAGACATTCTCAAGTTTCACA 4440

4441 CAGTAGTTTTTGATGTTATGTACACACACACCAAATGTGTAACAGTTCACCACTTCCAGA 4500 

4501 GTGTGGTCATGCCCAAAACATGTTTAAGAAAGGAAAGCAGTAGCTCCTTGCTAACGATGT 4560 

4561 TTCAGGAGGTTTGGGGCACTTGGTTTTAATGAGCTTCTGTCATTTAGGGCTTCTCTTGGC 4620 

4621 CATGGTCCCCTTCCTTCTGGAACTGTGATGTAGTCACATCCTACAGCCTTTAGTGCTGGT 4680

4681 TCACTAGTGTCAGATAATCAGTTCTTGGAATCGAGACTGCCGTGGCGAAGGGGTGGCCTC 4740

4741 GGAGGCAGGCTCTGGAGCTGCTTGGATGTCTTTAGGTGGGGTGGTGGCTGGCTCTCTTCA 4800 

4801 GCATGTAATTGGGGAAACCCTCGCGTCTACTAGGGGTGATACAGATGGTGATTTTAAAGA 4860

4861 GCAAAACTAGACTTCTATGTGAGAAGTGCTGGAAAATGATTTAGGACGTGTAAAGTTAGA 4920

4921 TGGAAAGACTGTAAATGTTTAATATGAATATAGTGTTCTTTTGAAGTAAGGCCAGCTGTT 4980

4981 GAACGGTTAAACTGTGCATTTCTCATTTTGATGTGTCATGTATGTTAATGTATGAAATGA 5040

5041 TTAAATAAAATCAAAACTGGTACCTGTTTATCCATAAAAAAAAAAAAAAAAAAAAAAAAA 5100 

5101 AAAAAAAAAAG 5111 
Figure 6A



BMY_HPP1_FL (1) 

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (1) MGHLPTGIHGARRLLPLLWLFVLFKNATAFHVTVQDDNNIWSLEASDVI 

MM_RPTPO (1) 

PYP3_SP (1) 



BMY_HPP1_FL (1) 

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (51) SPASVYVVKITGESKNYFFEFEEFNSTLPPPVIFKASYHGLYYIITLWV 

MM_RPTPO (1) 

PYP3_SP (1) 



BMY_HPP1_FL (1)

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (101) NGNVVTKPSRSITVLTKPLPVTSVSIYDYKPSPETGVLFEIHYPEKYNVF

MM_RPTPO (1) 

PYP3_SP (1) 



BMY_HPP1_FL (1)

BMY_HPP1_A (1)

BMY_HPP1_B (1)

HS_RPTPO (151) TRVNISYWEGKDFRTMLYKDFFKGKTVFNHWLPGMCYSNITFQLVSEATF

MM_RPTPO (1)

PYP3_SP (1) 



BMY_HPP1_FL (1) 

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (201) NKSTLVEYSGVSHEPKQHRTAPYPPQNISVRIVNLNKNNWEEQSGNFPEE 

MM_RPTPO (1) 

PYP3_SP (1) 



BMY_HPP1_FL (1)

BMY_HPP1_A (1)

BMY_HPP1_B (1)

HS_RPTPO (251) SFMRSQDTIGKEKLFHFTEETPEIPSGNISSGWPDFNSSDYETTSOPYWW

MM_RPTPO (1)

PYP3_SP (1) 



BMY_HPP1_FL (1) 

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (301) DSASAAPESEDEFVSVLPMEYENNSTLSETEKSTSGSFSFFPVOMILTWL 

MM_RPTPO (1) 

PYP3_SP (1) 



BMY_HPP1_FL (1)

BMY_HPP1_A (1) 

BMY_HPP1_B (1) 

HS_RPTPO (351) PPKPPTAFDGFHIHIEREENFTEYLMVDEEAHEFVAELKEPGKYKLSVTT 

MM_RPTPO (1) 

PYP3_SP (1) 
FIGURE 6B



BMY_HPP1_FL 
BMY_HPP1_A 
BMY_HPP1_B 
HS_RPTPO 
MM_RPTPO
PYP3_SP 



(401) FS SSGSCETRKSQSAKSLSFYI SPSGEWIEELTEKPQHVSVHVLS STTAL 



BMY_HPP1_FL 
BMY_HPP1_A 
BMY_HPP1_B 
HS_RPTPO 
MM_RPTPO 
PYP3_SP



(451) MSWTSSQENYNSTIVSWSLTCQKQKESQRLEKQYCTQVNSSKPIIENLV 

(1) 
HS_RPTPO
MM_RPTPO 
(501) PGAQYQWIYLRKGPLIGPPSDPVTFAIVPTGIKDLMLYPLGPTAWLSW
FIGURE 6D
Figure 7A
FIGURE 7B
Figure 8
Figure 9A
Figure 9B
Figure 11
Figure 12
Figure 13A

1 GAAAAGAAGACGAGGAGGAGAGCGACGGGACGGGACGCGAGCGGGAGCGCAGCCGCCCTC 60 

61 TCGGCTCCGCGGCGGCGCCTCGCAAGTCCGGGAGGCGAGGGGGGCCCGAGGGGAGACGCC 120 

121 GTGACAACTTTCGTTTCCCTCTGAGGGAATTGGGAGGTCGGCGGCCCCAAAAGCTTTCAG 180 

181 TCCAGTGTAAAGCTGTTGGAGCGCGGGAGCAAAGGTAAAGAATGATGTAATGCGCTGGCT 240

241 GCTCCAAAGCATCTTTTGTTGTGGAATGGTTATTCCAGTCATCTCTTTATGAATCAAATG 300 

301 TGAGGGGCTGCTTTGTGGACGGAGTCCTTTGCAAGAGCACATCAACGGGAAAGAGAAAGA 360 

361 GACATTCACTTGGAGGGCTCTTGCTGAAAATGGGTTTAACTCTCCTTTTGCCAGTCACCA 420 

421 CCAGCCTGACCTCATACACTTTTAGTACAATGGAGTGGCTGAGCCTTTGAGCACACCACC 480

481 ATTACATCATCGTGGCAAATTAAAGAAGGAGGTGGGAAAAGAGGACTTATTGTTGTCATG 540
1 M 1

541 GCCCATGAGATGATTGGAACTCAAATTGTTACTGAGAGGTTGGTGGCTCTGCTGGAAAGT 600 

2 AHEMIGTQIVTERLVALLES 21

601 GGAACGGAAAAAGTGCTGCTAATTGATAGCCGGCCATTTGTGGAATACAATACATCCCAC 660 

22 GTEKVLLIDSRPFVEYNTSH 41

661 ATTTTGGAAGCCATTAATATCAACTGCTCCAAGCTTATGAAGCGAAGGTTGCAACAGGAC 720

42 ILEAININCSKLMKRRLQQD 61

721 AAAGTGTTAATTACAGAGCTCATCCAGCATTCAGCGAAACATAAGGTTGACATTGATTGC 780 

62 KVLITELIQHSAKHKVDIDC 81

781 AGTCAGAAGGTTGTAGTTTACGATCAAAGCTCCCAAGATGTTGCCTCTCTCTCTTCAGAC 840

82 SQKVVVYDQSSQDVASLSSD 101

841 TGTTTTCTCACTGTACTTCTGGGTAAACTGGAGAAGAGCTTCAACTCTGTTCACCTGCTT 900 

102 CFLTVLLGKLEKSFNSVHLL 121

901 GCAGGTGGGTTTGCTGAGTTCTCTCGTTGTTTCCCTGGCCTCTGTGAAGGAAAATCCACT 960 

122 AGGFAEFSRCFPGLCEGKST 141
Figure 13B



961 CTAGTCCCTACCTGCATTTCTCAGCCTTGCTTACCTGTTGCCAACATTGGGCCAACCCGA 1020 
142 LVPTCISQPCLPVANIGPTR 161



1021 ATTCTTCCCAATCTTTATCTTGGCTGCCAGCGAGATGTCCTCAACAAGGAGCTGATACAG 1080 

162 ILPNLYLGCQRDVLNKELIQ 181

1081 CAGAATGGGATTGGTTATGTGTTAAATGCCAGCTATACCTGTCCAAAGCCTGACTTTATC 1140

182 QNGIGYVLNASYTCPKPDFI 201



1141 CCCGAGTCTCATTTCCTGCGTGTGCCTGTGAATGACAGCTTTTGTGAGAAAATTTTGCCG 1200 
202 PESHFLRVPVNDSFCEKILP 221 



1201 TGGTTGGACAAATCAGTAGATTTCATTGAGAAAGCAAAAGCCTCCAATGGATGTGTTCTA 1260 
222 WLDKSVDFIEKAKASNGCVL 241 



1261 GTGCACTGTTTAGCTGGGATCTCCCGCTCCGCCACCATCGCTATCGCCTACATCATGAAG 1320
242 VHCLAGISRSATIAIAYIMK 261



1321 AGGATGGACATGTCTTTAGATGAAGCTTACAGATTTGTGAAAGAAAAAAGACCTACTATA 1380
262 RMDMSLDEAYRFVKEKRPTI 281



1381 TCTCCAAACTTCAATTTTCTGGGCCAACTCCTGGACTATGAGAAGAAGATTAAGAACCAG 1440 
282 SPNFNFLGQLLDYEKKIKNQ 301



1441 ACTGGAGCATCAGGGCCAAAGAGCAAACTCAAGCTGCTGCACCTGGAGAAGCCAAATGAA 1500 
302 TGASGPKSKLKLLHLEKPNE 321 



1501 CCTGTCCCTGCTGTCTCAGAGGGTGGACAGAAAAGCGAGACGCCCCTCAGTCCACCCTGT 1560 
322 PVPAVSEGGQKSETPLSPPC 341 



1561 GCCGACTCTGCTACCTCAGAGGCAGCAGGACAAAGGCCCGTGCATCCCGCCAGCGTGCCC 1620

342 ADSATSEAAGQRPVHPASVP 361 

1621 AGCGTGCCCAGCGTGCAGCCGTCGCTGTTAGAGGACAGCCCGCTGGTACAGGCGCTCAGT 1680 

362 SVPSVQPSLLEDSPLVQALS 381 

1681 GGGCTGCACCTGTCCGCAGACAGGCTGGAAGACAGCAATAAGCTCAAGCGTTCCTTCTCT 1740

382 GLHLSADRLEDSNKLKRSFS 401

1741 CTGGATATCAAATCAGTTTCATATTCAGCCAGCATGGCAGCATCCTTACATGGCTTCTCC 1800

402 LDIKSVSYSASMAASLHGFS 421 



1801 TCATCAGAAGATGCTTTGGAATACTACAAACCTTCCACTACTCTGGATGGGACCAACAAG 1860
422 SSEDALEYYKPSTTLDGTNK 441 
1861 CTATGCCAGTTCTCCCCTGTTCAGGAACTATCGGAGCAGACTCCCGAAACCAGTCCTGAT 1920
442 LCQFSPVQELSEQTPETSPD 461 



1921 AAGGAGGAAGCCAGCATCCCCAAGAAGCTGCAGACCGCCAGGCCTTCAGACAGCCAGAGC 1980 
462 KEEASIPKKLQTARPSDSQS 481 



1981 AAGCGATTGCATTCGGTCAGAACCAGCAGCAGTGGCACCGCCCAGAGGTCCCTTTTATCT 2040 

482 KRLHSVRTSSSGTAQRSLLS 501 

2041 CCACTGCATCGAAGTGGGAGCGTGGAGGACAATTACCACACCAGCTTCCTTTTCGGCCTT 2100

502 PLHRSGSVEDNYHTSFLFGL 521 



2101 TCCACCAGCCAGCAGCACCTCACGAAGTCTGCTGGCCTGGGCCTTAAGGGCTGGCACTCG 2160 
522 STSQQHLTKSAGLGLKGWHS 541 



2161 GATATCTTGGCCCCCCAGACCTCTACCCCTTCCCTGACCAGCAGCTGGTATTTTGCCACA 2220 
542 DILAPQTSTPSLTSSWYFAT 561 



2221 GAGTCCTCACACTTCTACTCTGCCTCAGCCATCTACGGAGGCAGTGCCAGTTACTCTGCC 2280
562 ESSHFYSASAIYGGSASYSA 581 



2281 TACAGCTGCAGCCAGCTGCCCACTTGCGGAGACCAAGTCTATTCTGTGCGCAGGCGGCAG 2340
582 YSCSQLPTCGDQVYSVRRRQ 601 



2341 AAGCCAAGTGACAGAGCTGACTCGCGGCGGAGCTGGCATGAAGAGAGCCCCTTTGAAAAG 2400
602 KPSDRADSRRSWHEESPFEK 621 



2401 CAGTTTAAACGCAGAAGCTGCCAAATGGAATTTGGAGAGAGCATCATGTCAGAGAACAGG 2460
622 QFKRRSCQMEFGESIMSENR 641 



2461 TCACGGGAAGAGCTGGGGAAAGTGGGCAGTCAGTCTAGCTTTTCGGGCAGCATGGAAATC 2520

642 SREELGKVGSQSSFSGSMEI 661 

2521 ATTGAGGTCTCCTGAGAAGAAAGACACTTGTGACTTCTATAGACAATTTTTTTTTTCTTG 2580 

662 IEVS 665



2581 TTCACAAAAAAATTCCCTGTAAATCTGAAATATATATATGTACATACATATATATTTTTG 2640 



2641 GAAAATGGAGCTATGGTGTAAAAGCAACAGGTGGATCAACCCAGTTGTTACTCTCTTAAC 2700



2701 ATCTGCATTTGAGAGATCAGCTAATACTTCTCTCAACAAAAATGGAAGGGCAGATGCTAG 2760 



2761 AATCCCCCCTAGACGGAGGAAAACCATTTTATTCAGTGAATTACACATCCTCTTGTTCTT 2820
Figure 13D

2821 AAAAAAGCAAGTGTCTTTGGTGTTGGAGGACAAAATCCCCTACCATTTTCACGTTGTGCT 2880

2881 ACTAAGAGATCTCAAATATTAGTCTTTGTCCGGACCCTTCCATAGTACACCTTAGCGCTG 2940

2941 AGACTGAGCCAGCTTGGGGGTCAGGTAGGTAGACCCTGTTAGGGACAGAGCCTAGTGGTA 3000 

3001 AATCCAAGAGAAATGATCCTATCCAAAGCTGATTCACAAACCCACGCTCACCTGACAGCC 3060 

3061 GAGGGACACGAGCATCACTCTGCTGGACGGACCATTAGGGGCCTTGCCAAGGTCTACCTT 3120 

3121 AGAGCAAACCCAGTACCTCAGACAGGAAAGTCGGGGCTTTGACCACTACCATATCTGGTA 3180 

3181 GCCCATTTTCTAGGCATTGTGAATAGGTAGGTAGCTAGTCACACTTTTCAGACCAATTCA 3240 

3241 AACTGTCTATGCACAAAATTCCCGTGGGCCTAGATGGAGATAATTTTTTTTTCTTCTCAG 3300

3301 CTTTATGAAGAGAAGGGAAACTGTCTAGGATTCAGCTGAACCACCAGGAACCTGGCAACA 3360 

3361 TCACGATTTAAGCTAAGGTTGGGAGGCTAACGAGTCTACCTCCCTCTTTGTAAATCAAAG 3420 

3421 AATTGTTTAAAATGGGATTGTCAATCCTTTAAATAAAGATGAACTTGGTTTCAAGCCAAA 3480

3481 TGTGAATTTATTTGGGTTGGTAGCAGAGCAGCAGCACCTTCAAATTCTCAGCCAAAGCAG 3540

3541 ATGTTTTTGCCCTTTCTGCTTCACTGCATGGATACAGTTGGTAAAATGTAATAATATGGC 3600

3601 AGAATTTTATAGGAAACTTCCTAGGGAGGTAAATTATGGGAAGATTAAGAAAGGTACAAA 3660 

3661 TTGCTGAGGAGAAGCAGGAAACCTGTTTCCTTAGTGGCTTTTATCCCCTCGGCATGCGAT 3720 

3721 GGGGCTGATGTTTCTATGATTGCCTCAGACTTTCACATTTACTAGTAGGGCTGAGAGAGG 3780 

3781 CTTTAGTGAGGAAGGAATATTCAGAATAAAACGGTTGAGAAAGCTGAGAAGACCATTGAG 3840 

3841 TTTTGATCAGTTGTGAATAGAGTGCAAAGCCATGGCCAAGCTGTTTTTGGAAACGCTGGC 3900 

3901 CGGCGTGTCTTCAGTGGAAAAAGCAAATCAAAATGGAGCGAGAGCAAAGGGGCGTCCTCA 3960 
Figure 13E

3961 GTCCTCAACCTACAATCACTGTATGGAATCGGTCCTGGCAGCTGAACATAGGAGGTCACT 4020 

4021 GGAACAAGTGATAGTGCAGATTGGCTTTCAAACATCCTCCTGGCTTGAGTTTTATCAGCT 4080

4081 ACAATGTGGGTCCTCTTTTGAAGCCTTAATTCACAACAGCAGCTTTTTGGGGGTGGGGCT 4140

4141 GGGCGGGTGTTGTCATTGTTCTTTCCCTTCCTGTAAGTGTCGCTAGTTGCTGCCTCGTAT 4200 

4201 CTCAGGTTTTTCTCTGTTTTTGAGAAATGGACAGTTTTTTGACCAGGATGTGACTTCATG 4260 

4261 TTTCCTATGGTGACTTCTAAAACCAGCACAGAATGATATGACTCAACACAGACCGACTTG 4320 

4321 GTTATGGGGATGATGAGCCGCACAGACCTCACTAGTTGTGCACAAATAATGTGCTATGAT 4380

4381 GGGGTGTAAAGTGAAGGCAGAAGAGGGTCAGCCGCATTGTTATGATACTGGGAAAGTGCT 4440

4441 GGTCAACGATTTGAGTTAGTTTTTAGATATACATTGAAATCTTTAATCAGACATTCTCAA 4500 

4501 GTTTCACACAGTAGTTTTTGATGTTATGTACACACACACCAAATGTGTAACAGTTCACCA 4560 

4561 CTTCCAGAGTGTGGTCATGCCCAAAACATGTTTAAGAAAGGAAAGCAGTAGCTCCTTGCT 4620 

4621 AACGATGTTTCAGGAGGTTTGGGGCACTTGGTTTTAATGAGCTTCTGTCATTTAGGGCTT 4680

4681 CTCTTGGCCATGGTCCCCTTCCTTCTGGAACTGTGATGTAGTCACATCCTACAGCCTTTA 4740

4741 GTGCTGGTTCACTAGTGTCAGATAATCAGTTCTTGGAATCGAGACTGCCGTGGCGAAGGG 4800

4801 GTGGCCTCGGAGGCAGGCTCTGGAGCTGCTTGGATGTCTTTAGGTGGGGTGGTGGCTGGC 4860

4861 TCTCTTCAGCATGTAATTGGGGAAACCCTCGCGTCTACTAGGGGTGATACAGATGGTGAT 4920

4921 TTTAAAGAGCAAAACTAGACTTCTATGTGAGAAGTGCTGGAAAATGATTTAGGACATGTG 4980

4981 TAAAGTTAGATGGAAAGACTGTAAATGTTTAATATGAATATAGTGTTCTTTTGAAGTAAG 5040 

5041 GCCAGCTGTTGAACGGTTAAACTGTGCATTTCTCATTTTGATGTGTCATGTATGTTAATG 5100 

5101 TATGAAATGATTAAATAAAATCAAAACTGGTACCTGTTTATACATAAATACGAGAAAAGA 5160 
Figure 13F

5161 CCTATCTTTGCAGCCATAAACTCGGTGGGAACACCACCACTCAAGTTGCCAAAGGAGGCA 5220 
5221 GTGGTGAAACCTGTCCTGTTCTCACTTAAATGAGGATTTAGCTCAAAATAAAGTGGTGGT 5280 
5281 GTCATCAGGTTTATTCCGTGTTCTGTCATTCACATGGAACACCGGATGATTAGCTAACAG 5340 
5341 TTTAGTGCCAGCCTTCATTCTTTACTGTGTACGTTAAATGCACACTACAGTGAAAAAGCC 5400
5401 TAAGACACTTGGTAAATATTTTCTAGCTGACTGATTCCAGAACACACAAG 5450 
Figure 16A



1 GACTGAGGTTGTCAGCCCAGTGTAAAGCTGTTGGAGTGAGGGCAGAAAGGTAAAGGATGA 60 

61 TGTAATGCCTGGCTGCCCTAGAGCATCTTTTGTTGTGGGATGGGTATTCCCATCATCTCT 120 

121 ATGAATCTAGTGTGAGGGGCTGCTTTGTGGAAGGAATCCTTTGCAAGAGCATATCAACAG 180 

181 GAAAGAGAAAGAGACATTCAGTTGGAGGGCTCTTGCTGAAATGGATTTAACTCTCCTCTT 240 

241 GCCAGTCACCACTAGCCTGACCTCATACATTTTTAGTACAATGGAGTGGCTGAGCCTTTG 300 

301 AGCACAGCACCATTACATCATCGTGGCAAATTAAAGAACGAGGTGGGGAAAGAGGACTTA 360 

361 TTGTTGTCATGGCCCATGAGATGATTGGAACTCAAATTGTTACTGAGAGCTTGGTGGCTC 420 

1 MAHEMIGTQIVTESLVAL 18 

421 TGCTGGAAAGTGGAACGGAAAAAGTGCTGCTAATTGATAGCCGACCATTTGTGGAATACA 480 

19 LESGTEKVLLIDSRPFVEYN 38 

481 ATACGTCTCACATTTTGGAAGCCATTAATATCAACTGCTCCAAACTGATGAAGCGAAGGT 540

39 TSHILEAININCSKLMKRRL 58

541 TGCAACAGGACAAAGTATTAATTACAGAACTAATCCACCAATCTACAAAGCATAAGGTTG 600 

59 QQDKVLITELIHQSTKHKVD 78 

601 ACATTGACTGCAATCAAAGAGTGGTAGTTTATGATCACAGTTCACAAGATGTTGGTTCTC 660 

79 IDCNQRVVVYDHSSQDVGSL 98 

661 TGTCGTCAGACTGCTTTCTCACTGTACTTCTGGGTAAGCTGGAGAGAAGCTTCAACTCTG 720 

99 SSDCFLTVLLGKLERSFNSV 118

721 TCCACCTGCTTGCAGGTGGCTTTGCTGAGTTCTCTCGTTGTTTCCCTGGCCTCTGTGAAG 780

119 HLLAGGFAEFSRCFPGLCEG 138

781 GAAAGTCCACTCTAGTCCCTACCTGCATATCTCAGCCTTGCTTACCTGTTGCGAACATTG 840

139 KSTLVPTCISQPCLPVANIG 158

841 GGCCAACTCGAATTCTTCCCAATCTCTATCTTGGCTGCCAGCGAGATGTCCTCAACAAGG 900 

159 PTRILPNLYLGCQRDVLNKD 178

901 ACCTGATGCAACAGAATGGGATTGGCTATGTGTTAAATGCCAGCAATACCTGTCCAAAGC 960 

179 LMQQNGIGYVLNASNTCPKP 198
961 CTGACTTCATACCTGAATCTCACTTCCTGCGAGTGCCTGTGAATGACAGCTTTTGTGAGA 1020
199 DFIPESHFLRVPVNDSFCEK 218 



1021 AAATCCTACCATGGTTGGACAAGTCTGTGGATTTCATTGAGAAAGCAAAAGCCTCCAATG 1080 
219 ILPWLDKSVDFIEKAKASNG 238 



1081 GCTGTGTGCTTATCCACTGCTTAGCTGGGATCTCTCGCTCCGCCACTATTGCTATTGCCT 1140 
239 CVLIHCLAGISRSATIAIAY 258



1141 ACATCATGAAGAGGATGGACATGTCTCTAGATGAGGCTTACAGATTTGTGAAAGAAAAAA 1200 

259 IMKRMDMSLDEAYRFVKEKR 278 

1201 GACCTACTATATCTCCGAATTTTAATTTTATGGGCCAACTCATGGACTATGAGAAGACGA 1260

279 PTISPNFNFMGQLMDYEKTI 298



1261 TTAATAACCAGACTGGAATGTCAGGGCCAAAGAGCAAACTGAAGCTGCTGCACCTAGACA 1320 

299 NNQTGMSGPKSKLKLLHLDK 318 

1321 AACCCAGTGAGCCCGTGCCTGCAGCCTCAGAGGGCGGATGGAAGAGTGCACTGTCTCTCA 1380 

319 PSEPVPAASEGGWKSALSLS 338 

1381 GTCCACCCTGTGCCAACTCGACCTCGGAGGCATCAGGGCAAAGGCTTGTGCATCCTGCAA 1440

339 PPCANSTSEASGQRLVHPAS 358 



1441 GTGTGCCCCGCTTACAGCCGTCACTCTTAGAGGACAGTCCGCTGGTACAGGCGCTCAGTG 1500 

359 VPRLQPSLLEDSPLVQALSG 378 

1501 GGCTCCAGCTGTCCTCAGAGAAGCTGGAAGACAGCACTAAGCTCAAGCGTTCCTTCTCTC 1560 

379 LQLSSEKLEDSTKLKRSFSL 398 

1561 TCGATATCAAATCTGTTTCATATTCAGCCAGTATGGCCGCGTCCCTACACGGCTTCTCGT 1620

399 DIKSVSYSASMAASLHGFSS 418 



1621 CAGAGGAGGCTTTAGACTACTGCAAACCTTCTGCCACACTGGATGGGACCAACAAGCTCT 1680 
419 EEALDYCKPSATLDGTNKLC 438 



1681 GCCAGTTCTCCCCCGTTCAGGAGGTATCAGAACAGAGTCCAGAGACCAGCCCGGATAAGG 1740 
439 QFSPVQEVSEQSPETSPDKE 458 



1741 AGGAGGCCCACATCCCCAAGCAGCCCCAACCTCCCAGGCCTTCTGAGAGCCAGGTCACAC 1800 
459 EAHIPKQPQPPRPSESQVTR 478 



1801 GCTTGCACTCAGTGAGAACCGGCAGTAGTGGGTCCACCCAGAGGCCCTTCTTCTCGCCAC 1860
479 LHSVRTGSSGSTQRPFFSPL 498 
Figure 16C

1861 TGCATCGGAGCGGGAGTGTAGAGGACAATTACCATACCAACTTCCTTTTTGGCCTTTCCA 1920 

499 HRSGSVEDNYHTNFLFGLST 518 

1921 CCAGCCAGCAACACCTCACCAAGTCTGCAGGGCTTGGCCTCAAGGGCTGGCACTCAGATA 1980 

519 SQQHLTKSAGLGLKGWHSDI 538 

1981 TTCTGGCTCCCCAGTCCTCTGCCCCCTCCCTGACCAGCAGTTGGTATTTTGCTACGGAGC 2040

539 LAPQSSAPSLTSSWYFATEP 558 

2041 CTTCACACTTGTACTCTGCTTCAGCCATCTATGGAGGCAACAGCAGTTACTCTGCCTACA 2100 

559 SHLYSASAIYGGNSSYSAYS 578 

2101 GCTGTGGCCAGCTGCCCACTTGCAGTGACCAAATCTATTCTGTTCGTAGGCGGCAGAAGC 2160 

579 CGQLPTCSDQIYSVRRRQKP 598 

2161 CTACTGACAGAGCTGACTCGAGGCGGAGCTGGCATGAAGAGAGCCCCTTTGAAAAGCAGT 2220 

599 TDRADSRRSWHEESPFEKQF 618 

2221 TTAAACGCAGAAGCTGCCAAATGGAATTTGGAGAGAGCATTATGTCGGAGAACAGGTCCA 2280

619 KRRSCQMEFGESIMSENRSR 638

2281 GGGAGGAGCTGGGCAAGGTGGGCAGCCAGTCCAGCTTCTCCGGCAGCATGGAGATCATCG 2340

639 EELGKVGSQSSFSGSMEIIE 658 

2341 AGGTCTCTTGAGAAGACCTCGTCGCTTCTGTTGACAGTTTTGTTTCCTGTTCACAAAAAA 2400 

659 VS 660

2401 TAGTCCCTGTAAATCTGAAATATGTATATGTACATACATATATATTTTTGGAATATAGAG 2460

2461 CTACGGTATAAAAGCAACAGATGGATCAACACAGTTGTTCTCTCAGCACCTGCACTGAGA 2520

2521 ATAGCTAACTCTCAGAAAAGATTGGAAGGGTAGATGTTAGAATTCTCCCAGCCAGGAGAA 2580 

2581 GAGATTTGGTTCAGTGAATTGCACATCTTCTTGTTCCTACAAAAGCAAGGGTTTTGTTTG 2640 

2641 TTTGTATGTTGTTTGTTTTTAATGTTAGAGGGCAAAATCCCTCCCATTTTCACGTGCAAC 2700

2701 AGAGGTCTCAGAACTCATCTCTGTCCAGGCCCTTCCCTAGTGCACCTTAGCGCTAA 2756 
Figure 19A

1 GAAAAGAAGACGAGGAGGAGAGCGACGGGACGGGACGCGAGCGGGAGCGCAGCCGCCCTC 60 

61 TCGGCTCCGCGGCGGCGCCTCGCAAGTCCGGGAGGCGAGGGGGGCCCGAGGGGAGACGCC 120 

121 GTGACAACTTTCGTTTCCCTCTGAGGGAATTGGGAGGTCGGCGGCCCCAAAAGCTTTCAG 180 

181 TCCAGTGTAAAGCTGTTGGAGCGCGGGAGCAAAGGTAAAGAATGATGTAATGCGCTGGCT 240 

241 GCTCCAAAGCATCTTTTGTTGTGGAATGGTTATTCCAGTCATCTCTTTATGAATCAAATG 300 

301 TGAGGGGCTGCTTTGTGGACGGAGTCCTTTGCAAGAGCACATCAACGGGAAAGAGAAAGA 360 

361 GACATTCACTTGGAGGGCTCTTGCTGAAAATGGGTTTAACTCTCCTTTTGCCAGTCACCA 420

421 CCAGCCTGACCTCATACACTTTTAGTACAATGGAGTGGCTGAGCCTTTGAGCACACCACC 480

481 ATTACATCATCGTGGCAAATTAAAGAAGGAGGTGGGAAAAGAGGACTTATTGTTGTCATG 540

1 M 1



541 GCCCATGAGATGATTGGAACTCAAATTGTTACTGAGAGGTTGGTGGCTCTGCTGGAAAGT 600 
2 AHEMIGTQIVTERLVALLES 21



601 GGAACGGAAAAAGTGCTGCTAATTGATAGCCGGCCATTTGTGGAATACAATACATCCCAC 660 

22 GTEKVLLIDSRPFVEYNTSH 41

661 ATTTTGGAAGCCATTAATATCAACTGCTCCAAGCTTATGAAGCGAAGGTTGCAACAGGAC 720 

42 ILEAININCSKLMKRRLQQD 61

721 AAAGTGTTAATTACAGAGCTCATCCAGCATTCAGCGAAACATAAGGTTGACATTGATTGC 780 

62 KVLITELIQHSAKHKVDIDC 81



781 AGTCAGAAGGTTGTAGTTTACGATCAAAGCTCCCAAGATGTTGCCTCTCTCTCTTCAGAC 840
82 SQKVVVYDQSSQDVASLSSD 101



841 TGTTTTCTCACTGTACTTCTGGGTAAACTGGAGAAGAGCTTCAACTCTGTTCACCTGCTT 900 
102 CFLTVLLGKLEKSFNSVHLL 121



901 GCAGGTGGGTTTGCTGAGTTCTCTCGTTGTTTCCCTGGCCTCTGTGAAGGAAAATCCACT 960
122 AGGFAEFSRCFPGLCEGKST 141
961 CTAGTCCCTACCTGCATTTCTCAGCCTTGCTTACCTGTTGCCAACATTGGGCCAACCCGA 1020

142 LVPTCISQPCLPVANIGPTR 161

1021 ATTCTTCCCAATCTTTATCTTGGCTGCCAGCGAGATGTCCTCAACAAGGAGCTGATACAG 1080 

162 ILPNLYLGCQRDVLNKELIQ 181



1081 CAGAATGGGATTGGTTATGTGTTAAATGCCAGCTATACCTGTCCAAAGCCTGACTTTATC 1140

182 QNGIGYVLNASYTCPKPDFI 201

1141 CCCGAGTCTCATTTCCTGCGTGTGCCTGTGAATGACAGCTTTTGTGAGAAAATTTTGCCG 1200

202 PESHFLRVPVNDSFCEKILP 221

1201 TGGTTGGACAAATCAGTAGATTTCATTGAGAAAGCAAAAGCCTCCAATGGATGTGTTCTA 1260

222 WLDKSVDFIEKAKASNGCVL 241

1261 GTGCACTGTTTAGCTGGGATCTCCCGCTCCGCCACCATCGCTATCGCCTACATCATGAAG 1320

242 VHCLAGISRSATIAIAYIMK 261



1321 AGGATGGACATGTCTTTAGATGAAGCTTACAGATTTGTGAAAGAAAAAAGACCTACTATA 1380 

262 RMDMSLDEAYRFVKEKRPTI 281 

1381 TCTCCAAACTTCAATTTTCTGGGCCAACTCCTGGACTATGAGAAGAAGATTAAGAACCAG 1440 

282 SPNFNFLGQLLDYEKKIKNQ 301 

1441 ACTGGAGCATCAGGGCCAAAGAGCAAACTCAAGCTGCTGCACCTGGAGAAGCCAAATGAA 1500 

302 TGASGPKSKLKLLHLEKPNE 321 

1501 CCTGTCCCTGCTGTCTCAGAGGGTGGACAGAAAAGCGAGACGCCCCTCAGTCCACCCTGT 1560 

322 PVPAVSEGGQKSETPLSPPC 341 

1561 GCCGACTCTGCTACCTCAGAGGCAGCAGGACAAAGGCCCGTGCATCCCGCCAGCGTGCCC 1620

342 ADSATSEAAGQRPVHPASVP 361 

1621 AGCGTGCCCAGCGTGCAGCCGTCGCTGTTAGAGGACAGCCCGCTGGTACAGGCGCTCAGT 1680 

362 SVPSVQPSLLEDSPLVQALS 381 

1681 GGGCTGCACCTGTCCGCAGACAGGCTGGAAGACAGCAATAAGCTCAAGCGTTCCTTCTCT 1740

382 GLHLSADRLEDSNKLKRSFS 401

1741 CTGGATATCAAATCAGTTTCATATTCAGCCAGCATGGCAGCATCCTTACATGGCTTCTCC 1800

402 LDIKSVSYSASMAASLHGFS 421 

1801 TCATCAGAAGATGCTTTGGAATACTACAAACCTTCCACTACTCTGGATGGGACCAACAAG 1860

422 SSEDALEYYKPSTTLDGTNK 441 
1861 CTATGCCAGTTCTCCCCTGTTCAGGAACTATCGGAGCAGACTCCCGAAACCAGTCCTGAT 1920

442 LCQFSPVQELSEQTPETSPD 461 

1921 AAGGAGGAAGCCAGCATCCCCAAGAAGCTGCAGACCGCCAGGCCTTCAGACAGCCAGAGC 1980 

462 KEEASIPKKLQTARPSDSQS 481 



1981 AAGCGATTGCATTCGGTCAGAACCAGCAGCAGTGGCACCGCCCAGAGGTCCCTTTTATCT 2040
482 KRLHSVRTSSSGTAQRSLLS 501 



2041 CCACTGCATCGAAGTGGGAGCGTGGAGGACAATTACCACACCAGCTTCCTTTTCGGCCTT 2100 
502 PLHRSGSVEDNYHTSFLFGL 521 



2101 TCCACCAGCCAGCAGCACCTCACGAAGTCTGCTGGCCTGGGCCTTAAGGGCTGGCACTCG 2160 
522 STSQQHLTKSAGLGLKGWHS 541 



2161 GATATCTTGGCCCCCCAGACCTCTACCCCTTCCCTGACCAGCAGCTGGTATTTTGCCACA 2220 
542 DILAPQTSTPSLTSSWYFAT 561 



2221 GAGTCCTCACACTTCTACTCTGCCTCAGCCATCTACGGAGGCAGTGCCAGTTACTCTGCC 2280
562 ESSHFYSASAIYGGSASYSA 581 



2281 TACAGCTGCAGCCAGCTGCCCACTTGCGGAGACCAAGTCTATTCTGTGCGCAGGCGGCAG 2340 

582 YSCSQLPTCGDQVYSVRRRQ 601 

2341 AAGCCAAGTGACAGAGCTGACTCGCGGCGGAGCTGGCATGAAGAGAGCCCCTTTGAAAAG 2400

602 KPSDRADSRRSWHEESPFEK 621 



2401 CAGTTTAAACGCAGAAGCTGCCAAATGGAATTTGGAGAGAGCATCATGTCAGAGAACAGG 2460
622 QFKRRSCQMEFGESIMSENR 641 



2461 TCACGGGAAGAGCTGGGGAAAGTGGGCAGTCAGTCTAGCTTTTCGGGCAGCATGGAAATC 2520
642 SREELGKVGSQSSFSGSMEI 661 



2521 ATTGAGGTCTCCTGAGAAGAAAGACACTTGTGACTTCTATAGACAATTTTTTTTTTCTTG 2580 



2581 TTCACAAAAAAATTCCCTGTAAATCTGAAATATATATATGTACATACATATATATTTTTG 2640 



2641 GAAAATGGAGCTATGGTGTAAAAGCAACAGGTGGATCAACCCAGTTGTTACTCTCTTAAC 2700



2701 ATCTGCATTTGAGAGATCAGCTAATACTTCTCTCAACAAAAATGGAAGGGCAGATGCTAG 2760 



2761 AATCCCCCCTAGACGGAGGAAAACCATTTTATTCAGTGAATTACACATCCTCTTGTTCTT 2820 
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Figure 19D 

2821 AAAAAAGCAAGTGTCTTTGGTGTTGGAGGACAAAATCCCCTACCATTTTCACGTTGTGCT 2880 

2881 ACTAAGAGATCTCAAATATTAGTCTTTGTCCGGACCCTTCCATAGTACACCTTAGCGCTG 2940

2941 AGACTGAGCCAGCTTGGGGGTCAGGTAGGTAGACCCTGTTAGGGACAGAGCCTAGTGGTA 3000

3001 AATCCAAGAGAAATGATCCTATCCAAAGCTGATTCACAAACCCACGCTCACCTGACAGCC 3060

3061 GAGGGACACGAGCATCACTCTGCTGGACGGACCATTAGGGGCCTTGCCAAGGTCTACCTT 3120 

3121 AGAGCAAACCCAGTACCTCAGACAGGAAAGTCGGGGCTTTGACCACTACCATATCTGGTA 3180 

3181 GCCCATTTTCTAGGCATTGTGAATAGGTAGGTAGCTAGTCACACTTTTCAGACCAATTCA 3240 

3241 AACTGTCTATGCACAAAATTCCCGTGGGCCTAGATGGAGATAATTTTTTTTTCTTCTCAG 3300 

3301 CTTTATGAAGAGAAGGGAAACTGTCTAGGATTCAGCTGAACCACCAGGAACCTGGCAACA 3360 

3361 TCACGATTTAAGCTAAGGTTGGGAGGCTAACGAGTCTACCTCCCTCTTTGTAAATCAAAG 3420 

3421 AATTGTTTAAAATGGGATTGTCAATCCTTTAAATAAAGATGAACTTGGTTTCAAGCCAAA 3480 

3481 TGTGAATTTATTTGGGTTGGTAGCAGAGCAGCAGCACCTTCAAATTCTCAGCCAAAGCAG 3540

3541 ATGTTTTTGCCCTTTCTGCTTCACTGCATGGATACAGTTGGTAAAATGTAATAATATGGC 3600

3601 AGAATTTTATAGGAAACTTCCTAGGGAGGTAAATTATGGGAAGATTAAGAAAGGTACAAA 3660 

3661 TTGCTGAGGAGAAGCAGGAAACCTGTTTCCTTAGTGGCTTTTATCCCCTCGGCATGCGAT 3720 

3721 GGGGCTGATGTTTCTATGATTGCCTCAGACTTTCACATTTACTAGTAGGGCTGAGAGAGG 3780 

3781 CTTTAGTGAGGAAGGAATATTCAGAATAAAACGGTTGAGAAAGCTGAGAAGACCATTGAG 3840

3841 TTTTGATCAGTTGTGAATAGAGTGCAAAGCCATGGCCAAGCTGTTTTTGGAAACGCTGGC 3900 

3901 CGGCGTGTCTTCAGTGGAAAAAGCAAATCAAAATGGAGCGAGAGCAAAGGGGCGTCCTCA 3960 

3961 GTCCTCAACCTACAATCACTGTATGGAATCGGTCCTGGCAGCTGAACATAGGAGGTCACT 4020



D0072 NP 

Figure 19E 

4021 GGAACAAGTGATAGTGCAGATTGGCTTTCAAACATCCTCCTGGCTTGAGTTTTATCAGCT 4080 

4081 ACAATGTGGGTCCTCTTTTGAAGCCTTAATTCACAACAGCAGCTTTTTGGGGGTGGGGCT 4140 

4141 GGGCGGGTGTTGTCATTGTTCTTTCCCTTCCTGTAAGTGTCGCTAGTTGCTGCCTCGTAT 4200 

4201 CTCAGGTTTTTCTCTGTTTTTGAGAAATGGACAGTTTTTTGACCAGGATGTGACTTCATG 4260 

4261 TTTCCTATGGTGACTTCTAAAACCAGCACAGAATGATATGACTCAACACAGACCGACTTG 4320 

4321 GTTATGGGGATGATGAGCCGCACAGACCTCACTAGTTGTGCACAAATAATGTGCTATGAT 4380

4381 GGGGTGTAAAGTGAAGGCAGAAGAGGGTCAGCCGCATTGTTATGATACTGGGAAAGTGCT 4440

4441 GGTCAACGATTTGAGTTAGTTTTTAGATATACATTGAAATCTTTAATCAGACATTCTCAA 4500

4501 GTTTCACACAGTAGTTTTTGATGTTATGTACACACACACCAAATGTGTAACAGTTCACCA 4560 

4561 CTTCCAGAGTGTGGTCATGCCCAAAACATGTTTAAGAAAGGAAAGCAGTAGCTCCTTGCT 4620 

4621 AACGATGTTTCAGGAGGTTTGGGGCACTTGGTTTTAATGAGCTTCTGTCATTTAGGGCTT 4680

4681 CTCTTGGCCATGGTCCCCTTCCTTCTGGAACTGTGATGTAGTCACATCCTACAGCCTTTA 4740

4741 GTGCTGGTTCACTAGTGTCAGATAATCAGTTCTTGGAATCGAGACTGCCGTGGCGAAGGG 4800

4801 GTGGCCTCGGAGGCAGGCTCTGGAGCTGCTTGGATGTCTTTAGGTGGGGTGGTGGCTGGC 4860

4861 TCTCTTCAGCATGTAATTGGGGAAACCCTCGCGTCTACTAGGGGTGATACAGATGGTGAT 4920

4921 TTTAAAGAGCAAAACTAGACTTCTATGTGAGAAGTGCTGGAAAATGATTTAGGACATGTG 4980

4981 TAAAGTTAGATGGAAAGACTGTAAATGTTTAATATGAATATAGTGTTCTTTTGAAGTAAG 5040 

5041 GCCAGCTGTTGAACGGTTAAACTGTGCATTTCTCATTTTGATGTGTCATGTATGTTAATG 5100 

5101 TATGAAATGATTAAATAAAATCAAAACTGGTACCTGTTTATACATAAATACGAGAAAAGA 5160 
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Figure 19F 

5161 CCTATCTTTGCAGCCATAAACTCGGTGGGAACACCACCACTCAAGTTGCCAAAGGAGGCA 5220 
5221 GTGGTGAAACCTGTCCTGTTCTCACTTAAATGAGGATTTAGCTCAAAATAAAGTGGTGGT 5280 
5281 GTCATCAGGTTTATTCCGTGTTCTGTCATTCACATGGAACACCGGATGATTAGCTAACAG 5340
5341 TTTAGTGCCAGCCTTCATTCTTTACTGTGTACGTTAAATGCACACTACAGTGAAAAAGCC 5400
5401 TAAGACACTTGGTAAATATTTTCTAGCTGACTGATTCCAGAACACACAAG 5450 
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Figures 20A 



1 CCACGCGTCCGGCTCTTGCCTCCCAGTGCCATGCAGGTGCAGGATGCAACCAGGCGGCCC 60 



61 TCAGCCGTGCGCTTCCTCAGCTCCTTTCTCCAGGGCCGCCGGCACTCCACCTCAGACCCA 120



121 GTACTGCGGCTGCAGCAGGCCCGGCGGGGCTCTGGCTTGGGCTCCGGCTCTGCCACGAAG 180 



181 CTGCTGTCCTCGTCCTCTCTCCAGGTGATGGTGGCTGTTTCCTCAGTCAGCCATGCAGAG 240



241 GGAAACCCAACTTTCCCCGAAAGAAAAAGAAATTTAGAACGTCCAACACCAAAGTACACA 300 



301 AAAGTAGGGGAGCGTTTACGGCATGTCATTCCTGGACACATGGCATGTTCCATGGCGTGT 360 



361 GGCGGTAGAGCTTGCAAGTATGAGAACCCAGCCCGCTGGAGTGAGCAGGAGCAAGCCATT 420 



421 AAGGGGGTTTACTCATCCTGGGTCACTGATAATATACTGGCCATGGCCCGCCCATCCTCT 480 



481 GAGCTCCTGGAGAAGTACCACATCATTGATCAGTTCCTCAGCCATGGCATAAAAACAATA 540

541 ATCAACCTCCAGCGCCCTGGTGAGCATGCTAGCTGTGGGAACCCTCTGGAACAAGAAAGT 600

601 GGCTTCACATACCTTCCTGAGGCTTTCATGGAGGCTGGCATTTACTTCTACAATTTCGGA 660
1 MEAGIYFYNFG 11

661 TGGAAGGATTATGGTGTAGCGTCTCTTACTACTATCCTAGATATGGTGAAGGTGATGACA 720 
12WKDYGVASLTTILDMVKVMT 31 



721 TTTGCCTTACAGGAAGGAAAAGTAGCTATCCATTGTCATGCAGGGCTTGGTCGAACAGGT 780
32 FALQEGKVAIHCHAGLGRTG 51

781 GTTTTAATAGCCTGTTACTTAGTTTTTGCAACGAGAATGACTGCTGACCAAGCAATTATA 840
52 VLIACYLVFATRMTADQAII 71



841 TTTGTGCGGGCAAAGCGACCCAATTCCATACAAACCAGAGGACAGCTCCTCTGTGTAAGG 900 
72FVRAKRPNSIQTRGQLLCVR 91 



901 GAATTTACTCAGTTTCTAACTCCTCTCCGCAATATATTCTCTTGCTGTGATCCCAAAGCA 960 
92EFTQFLTPLRNIFSCCDPKA 111 



961 CATGCTGTCACCTTACCTCAATATCTAATTCGCCAGCGTCATCTGCTTCATGGTTATGAG 1020 
112HAVTLPQYLIRQRHLLHGYE 131 



1021 GCACGACTTCTGAAACACGTGCCAAAAATTATCCACCTAGTTTGCAAATTGCTGCTGGAC 1080 
132ARLLKHVPKIIHLVCKLLLD 151 



1081 TTAGCGGAGAACAGGCCAGTGATGATGAAGGATGTGTCCGAAGGACCTGGTCTCTCTGCT 1140
152LAENRPVMMKDVSEGPGLSA 171 
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Figures 20B 



1141 GAAATAGAAAAGACAATGTCTGAGATGGTCACCATGCAGCTGGATAAAGAGTTACTGAGG 1200 
172 EIEKTMSEMVTMQLDKELLR 191 



1201 CATGACAGTGATGTGTCCAACCCGCCTAACCCCACTGCAGTGGCAGCAGATTTTGACAAT 1260 
192 HDSDVSNPPNPTAVAADFDN 211 



1261 CGAGGCATGATTTTCTCCAATGAGCAACAGTTTGACCCTCTTTGGAAAAGGCGGAATGTT 1320 
212 RGMIFSNEQQFDPLWKRRNV 231 



1321 GAGTGCCTTCAACCCCTGACTCATCTGAAAAGGCGGCTCAGCTACAGTGACTCAGATTTA 1380 
232 ECLQPLTHLKRRLSYSDSDL 251 



1381 AAGAGGGCCGAGAACCTCCTGGAGCAAGGGGAGACTCCACAGACAGTGCCTGCCCAGATC 1440 
252 KRAENLLEQGETPQTVPAQI 271 



1441 TTGGTTGGCCACAAGCCCAGGCAGCAGAAGCTCATAAGCCATTGTTACATCCCACAGTCT 1500
272 LVGHKPRQQKLISHCYIPQS 291



1501 CCAGAACCAGACTTACACAAGGAAGCCTTGGTTCGCAGCACACTTTCTTTCTGGAGTCAG 1560 
292 PEPDLHKEALVRSTLSFWSQ 311 



1561 TCAAAGTTTGGAGGCCTGGAAGGACTCAAAGATAATGGGTCACCAATTTTCCATGGAAGG 1620 
312SKFGGLEGLKDNGSPIFHGR 331 



1621 ATCATTCCAAAGGAAGCACAGCAGAGTGGAGCTTTCTCTGCAGATGTTTCAGGCTCACAC 1680 
332 IIPKEAQQSGAFSADVSGSH 351 



1681 AGCCCTGGGGAGCCAGTTTCACCCAGCTTTGCAAATGTCCATAAGGATCCAAACCCTGCT 1740 
352 SPGEPVSPSFANVHKDPNPA 371



1741 CACCAGCAAGTGTCTCACTGTCAGTGTAAAACTCATGGTGTTGGGAGCCCTGGCTCTGTC 1800 
372 HQQVSHCQCKTHGVGSPGSV 391 



1801 AGGCAGAACAGCAGGACACCCCGAAGCCCTCTGGACTGTGGCTCCAGTCCCAAAGCACAG 1860 
392 RQNSRTPRSPLDCGSSPKAQ 411 



1861 TTCTTGGTTGAACATGAAACCCAGGACAGTAAAGATCTGTCTGAAGCAGCTTCACACTCT 1920
412FLVEHETQDSKDLSEAASHS 431 



1921 GCATTACAGTCTGAATTGAGTGCTGAGGCAAGAAGAATACTGGCGGCCAAAGCCCTAGCA 1980 
432 ALQSELSAEARRILAAKALA 451 



1981 AATTTAAATGAATCTGTAGAAAAGGAGGAACTAAAAAGGAAGGTAGAAATGTGGCAGAAA 2040 
452 NLNESVEKEELKRKVEMWQK 471 



2041 GAGCTTAATTCCCGAGATGGAGCTTGGGAAAGAATATGTGGCGAGAGGGACCCTTTCATC 2100 
472 ELNSRDGAWERICGERDPFI 491 
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Figures 20C 

2101 CTATGCAGCTTGATGTGGTCTTGGGTGGAGCAACTGAAGGAGCCTGTAATCACCAAAGAG 2160 
492 LCSLMWSWVEQLKEPVITKE 511 

2161 GATGTGGACATGTTGGTTGACAGGCGAGCAGATGCCGCAGAAGCACTTTTTTTATTAGAG 2220 
512DVDMLVDRRADAAEALFLLE 531 

2221 AAGGGACAGCACCAGACTATTCTCTGCGTGTTGCACTGCATAGTGAACCTGCAGACAATT 2280 
532 KGQHQTILCVLHCIVNLQTI 551 

2281 CCCGTGGATGTGGAGGAAGCTTTCCTTGCCCATGCCATTAAGGCATTCACTAAGGTTAAT 2340 
552 PVDVEEAFLAHAIKAFTKVN 571 

2341 TTTGATTCTGAAAATGGACCAACAGTTTACAACACCCTGAAGAAAATATTTAAGCACACG 2400 
572 FDSENGPTVYNTLKKIFKHT 591 

2401 CTGGAAGAAAAAAGAAAAATGACAAAAGATGGCCCTAAGCCTGGCCTCTAGCTTTCACTC 2460 
592 LEEKRKMTKDGPKPGL* 607 

2461 ATGGTGAATATTTCAGACCTAAAGATCCAGATAGTATCTCTGTTCATATGTGAATAAGTT 2520 

2521 GAAGATTGTGGGGCTACTTTTTCTCATAGCACTTTATTTTGAATGTTGTTAGTTTGTGCT 2580 

2581 GAGAATGGTCGTCCGTATTTGAACCAATTATTTATTTTAAAATATATTTAAGCTACATTT 2640

2641 TTGTTTTGAAAAATTGCCATAAATTTGGTGCCACTTTCTTTTATTTATTTGACTGAGTTA 2700

2701 ATATTATTGTATTAACATTTTAAGTATATGGTGTTTACATTCTTATTTCTTTTGACATTT 2760

2761 TGGAAATAATCATAACTTGTCTTTCCAAAATAACCATTTTCTTGATGGAACTCTTCCTAG 2820

2821 AGTTTTTACCAAATAGCTAACTTTAGTAGTAAAACCTCATTGTGTATCCATTCCCCCACA 2880 

2881 GATGAACTAAGAAAGTCACCAAGTGTCTTAAGCTGTTTTATATTTGTTACGAAGAAGGCT 2940 

2941 ATTGCTACAATATTTTTAAAGGTTTCTTTTTTAACTTTGAAATTTTTTGTTTTTCCTTTT 3000 

3001 CTTTTTATAAATGTAACAGAGGGTTTCAAAGCATATTATTTTTCAGAGAGATTTAGTTTT 3060 

3061 ACTTTAATGGAGTGACTGTGAAGTGGTTGGGATTTTTTGCTTGTAGAAAGTAGACTTGCT 3120 

3121 CTTTGTCAGATTTCCAAACAACCTTGCCAGCCTTGGCTGTCAAAAGGAGGCAGGAGCAGT 3180 

3181 TCTCAACACACCAAGCCTTATTCCCACTCCCTTGGGTTGCTGCTGAGCCAAATAGCATCT 3240 

3241 TTACAGAGGAAGTGGGATCAGAGGCAGGAAGTGTGGAAAGTTGCTAAGAAGCAGGGCTTG 3300 
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3301 CCTCTGTCCTCCCGGGGACTCCACAGGGATATTCGTGCAGGGCAGGGGCTCTGTGCCAGC 3360 
3361 CCTGCTCTCTCAGATGCCACAGCCACTCTGCAGAGGTGACTCTTGGAGCTGGAGGAAGTC 3420 
3421 AAAACTGGGCCACTGTTTGTACTGATGGTGTATTAGCATGAGCAGCGTGGCCCTGGCCCC 3480
3481 ACACTCCCAAATCTGCCACTCCATAGACCCACTTGCCTCAAGGCTTTATATTTGGCTGCT 3540 
3541 TTCTTACAATGAGAATTAAGATTTTTAAACTGAAGTTGACCATACAGGTTGCATTAGCCC 3600 
3601 TAACTGGCTTCATGTAAGAAGGGTGACTGCCTAAACTAGTTCCTTGTAAGCTGAACCATC 3660
3661 AATTATCAGTTGAAGCCATACTTTTATTTAAATTAATATACGTAGATACCAGAGGCCAAG 3720 
3721 CCACAGAGAGGATAATAGTTCTTCCCAATiw^GGTGATATTAATCAGACTAATTTCGAAC 3780 
3781 TAAAGAAGTTACTGCTTAAAGACGGAATTTCAGGGGAAGCAAGACTCATTTAGAACAAAT 3840 
3841 GAAATTTCTCCAGTCCTACATTTCTGAATTGACTTCTAGCACATCAAAAATATTTCAGTC 3900 
3901 ATTATCAGTCTCATTAACTGAAATGCCAAATGCTAAATGCAGTGTTCTTTCACACTGTTT 3960 
3961 TAATTTTCTTGGGAAATTGAGTCCAGTGGATGTTAATGGAGTGGGTTGCCCATCCCTGAA 4020 
4021 ATGTCTTATTTTCAAGTGCCTGGCCTGGGAAAGAAGGGGAAGAAACAATTGCATTATATC 4080
4081 CAAAGATACACTATAAAAATAGAGTTTTTACCAAAAAAAGATGTTTGTTCTCATCTCAGT 4140
4141 AGGCCTCATTTGGGCAAGTGACCCACAGGTCTTTTGGCGAGTTTGCTATTTGCCTGTTGA 4200 
4201 AATACTTGTTTCAACTTAGAGAACAGTTATGATGTGACCATAGCATGGCACAACTAAAAA 4260
4261 TCTAAGCCTGAAACCTGAAAAAAGAGATATGACAAGGGAAATTAATCAGGCTATACATAA 4320 
4321 GTATTGTATTTATTTGAATAAAAATAAAAAGAGCAACCCATAAAAAAAAAAAAAAAAAAA 4380 
4381 AAAAAAAAAAAAG 4393 
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Figures 21 

1 CCACGCGTCCGGCGAGGGGACGCGTGGGCGGAGCGGGGCTGGCCAGCCTCGGCCCCCATG 60 

61 ACCCGCTGTCCTGTGCCCTTTCCCAGCGATGGGCGTGCAGCCCCCCAACTTCTCCTGGGT 120 
1 MGVQPPNFSWV 11

121 GCTTCCGGGCCGGCTGGCGGGACTGGCGCTGCCGCGGCTCCCCGCCCACTACCAGTTCCT 180 
11LPGRLAGLALPRLPAHYQFL31 

181 GTTGGACCTGGGCGTGCGGCACCTGGTGTCCCTGACGGAGCGCGGGCCCCCTCACAGCGA 240
31LDLGVRHLVSLTERGPPHSD51 



241 CAGCTGCCCCGGCCTCACCCTGCACCGCCTGCGCATCCCCGACTTCTGCCCGCCGGCCCC 300 
51SCPGLTLHRLRIPDFCPPAP71 



301 CGACCAGATCGACCGCTTCGTGCAGATCGTGGACGAGGCCAACGCACGGGGAGAGGCTGT 360 
71DQIDRFVQIVDEANARGEAV91 



361 GGGAGTGCACTGTGCTCTGGGCTTTGGCCGCACTGGCACCATGCTGGCCTGTTACCTGGT 420 
91 GVHCALGFGRTGTMLACYLV 111

421 GAAGGAGCGGGGCTTGGCTGCAGGAGATGCCATTGCTGAAATCCGACGACTACGACCCGG 480 
111KERGLAAGDAIAEIRRLRPG131 

481 CTCCATCGAGACCTATGAGCAGGAGAAAGCAGTCTTCCAGTTCTACCAGCGAACGAAATA 540 
131SIETYEQEKAVFQFYQRTK*150 



541 AGGGGCCTTAGTACCCTTCTACCAGGCCCTCACTCCCCTTCCCCATGTTGTCGATGGGGC 600 
601 CAGAGATGAAGGGAAGTGGACTAAAGTATTAAACCCTCTAGCTCCCATTGGCTGAAGACA 660 
661 CTGAAGTAGCCCACCCCTGCAGGCAGGTCCTGATTGAAGGGGAGGCTTGTACTGCTTTGT 720 
721 TGAATAAATGAGTTTTACGAACCAGGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 780 



781 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 840 



841 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGC 878 
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Figure 22 
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Figure 24. 



BMY_HPP1

Protein                                                               Genbank ID     Identities   Similarities
human protein tyrosine phosphatase                                    gi|P32587      27%          39.6%
mouse protein tyrosine phosphatase                                    gi|NP_035346   27.9%        40.5%
Schizosaccharomyces pombe protein tyrosine phosphatase PYP3 protein   gi|NP_002839   27.5%        36.7%


BMY_HPP2

Protein                                                               Genbank ID     Identities   Similarities
human S. cerevisiae CDC14 homolog A                                   gi|NP_003663   33.1%        44.1%
human S. cerevisiae CDC14 homolog B                                   gi|NP_003662   33.1%        45.8%
yeast soluble tyrosine-specific protein phosphatase Cdc14p protein    gi|NP_002839   33.1%        45.8%
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RET31

Protein                                                               Genbank ID     Identities   Similarities
human protein-tyrosine phosphatase DUS8 protein                       gi|U27193      50.3%        56.8%
the human dual specificity MAP kinase DUSP6 protein                   gi|AB013382    36.5%        48.3%
human map kinase phosphatase MKP-5 protein                            gi|AB026436    34.3%        47.2%
mouse RET31 protein                                                   N/A            90%          92%
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HPP1               MAAGVLPQNE QPYSTLVNNS EC.VANMKGN LERPTPKYTK
pdb1aax MEMEKEFEQI DKSGSWAAIY QDIRHEASDF PCRVAKLPKN KNRNRYRDVS

HPP1    VGERLRHVIP GHMACSMACG GRACKYENPA RWSEQEQAIK GVYSSWVTDN
pdb1aax PFDH..SRIK LHQEDNDYIN ASLIKME    EAQRS      YILTQGPLPN

HPP1    ILAMARPSSE LLEKYHIIDQ FLSHGIKTII NLQRPGE..H ASCGNPLEQE
pdb1aax                       VWEQKSRGVV MLNRVMEKGS LKCAQYWPQK

HPP1                          PEAFMEAG.             ..IYFYNFG
pdb1aax EEKEMIFEDT NLKLTLISED IKSYYTVRQL ELENLTTQET REILHFHYTT

HPP1    WKDYGVA.SL TTILDMVKVM TFALQE     GKVAIHCHAG LGRTGVLIAC 203
pdb1aax WPDFGVPESP ASFLNFLFKV RESGSLSPEH GPVVVHSSAG IGRSGTFCLA 228

HPP1    YLVFATR    MTADQ      AIIFVRAKRP NSI....QTR GQLLCVREFT 241
pdb1aax DTCLLLMDKR KDPSSVDIKK VLLEMRKFRM GLIQTADQLR FSYLAVIEGA 278

HPP1    QFLTPLRNIF SCCDPKAHAV TLPQYLIRQR HLLHGYEARL LKHVPKIIHL 291
pdb1aax KFIM GDSSVQDQWK ELSHEDLEPP PGHIPPPPRP 312

HPP1    VCKLLLDLAE NRPVMMKDVS EGPGLSAEIE KTMSEMVTMQ LDKELLRHDS 301
pdb1aax PKRILEPHN 321
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Figure 28 




HPP1 Homology Model
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Figure 29 



Template (1AAX) vs Model

Residue Number
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Figure 31 



20 30 40 50 60 70 

pdb1vhrA GCYSLPSQPCNEVTPRIYVGNASVAQDIPKLQKLGITHVLNAAEGRSFMHVNTNANFYKD



BMY_HPP2 MGVQPPNFSWVLPGRLAGLALPRLPAHYQFLLDLGVRHLVSLTE-RGPPHSDSCP 

10 20 30 40 50 

80 90 100 110 120 130 

pdb1vhrA SGITYLGIKANDTQEFNLSA--YFERAADFIDQALAQKNGRVLVHCREGYSRSPTLVIAY

BMY_HPP2 -GLTLHRLRIPD FCPPAPDQIDRFVQIVDEANARGEA-VGVHCALGFGRTGTMLACY 

60 70 80 90 100 

140 150 160 170 180 

pdb1vhrA LMMRQKMDVKSALSIVRQNREIGPNDGFLAQLCQLNDRLAKEGKLKP

BMY_HPP2 LVKERGLAAGDAIAEIRRLRPGSIETYEQEKAVFQFYQRTK 
110 120 130 140 150 
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HPP2 Homology Model 
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Template (1VHR) vs Model 
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Figure 36 
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Figure 37 



10 20 30 

pdb1mkp ASFPVEILPFLYLGCAKDSTNLDVLEEFGIKYI



BMY_HPP5 SRCFPGLCEGKSTLVPTCISQPCLPVANIGPTRILPNLYLGCQRDVLNKELMQQNGIGYV 
130 140 150 160 170 180 

40 50 60 70 80 90 

pdb1mkp LNVTPNLPNLFENAGEFKYKQIPISDHWSQNLSQFFPEAISFIDEARGKNCGVLVHSLAG

BMY_HPP5 LNASNTCPKP-DFIPESHFLRVPVNDSFCEKILPWLDKSVDFIEKAKASNGCVLVHCLAG
190 200 210 220 230 240 

100 110 120 130 140 

pdb1mkp ISRSVTVTVAYLMQKLNLSMNDAYDIVKMKKSNISPNFNFMGQLLDFERTL

BMY_HPP5 ISRSATIAIAYIMKRMDMSLDEAYRFVKEKRPTISPNFNPLGQLLAYEKKIKNQTGASGP
250 260 270 280 290 300 

BMY_HPP5 KSKLKLLPLEKPNEPVPAVSEGGQKSETPLSPPCADSATSEAAGQRPVHPASVPSVPSVQ 
310 320 330 340 350 360 
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HPP5 Homology Model 
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Template (1MKP) vs Model 
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